Datum = assumed shape of the earth
Projection = how we map the coordinates to a flat plane for plotting. Shapefiles need a datum but do not need to have a projection. You can change the projection to plot your shapefile most accurately.
To crop plotfile, create polygon of location you want and then use st_intersect. Also can use st_crop.
Load packages
library(sf)
library(tidyverse)
require(knitr)
library(scales)
library(ggmap)
require(leaflet)
ak_regions <- read_sf("Data/shapefiles/ak_regions_simp.shp")
plot(ak_regions)
read_sf will look for all other files with the same name, but different extension to read in associated files like the .dbf, .prj and .shx files. You just have to pass it the .shp file for it to find and read these in.
What is the coordinate references system?
st_crs(ak_regions)
## Coordinate Reference System:
## EPSG: 4326
## proj4string: "+proj=longlat +datum=WGS84 +no_defs"
class(ak_regions)
## [1] "sf" "tbl_df" "tbl" "data.frame"
There is no projection assigned to this data
Let’s assign a better projection. EPSG 3338 is a better projection for Alaska.
ak_regions_3338 <- ak_regions %>%
st_transform(crs=3338)
plot(ak_regions_3338)
kable(summary(ak_regions_3338))
| region_id | region | mgmt_area | geometry | |
|---|---|---|---|---|
| Min. : 1 | Length:13 | Min. :1 | MULTIPOLYGON :13 | |
| 1st Qu.: 4 | Class :character | 1st Qu.:2 | epsg:3338 : 0 | |
| Median : 7 | Mode :character | Median :3 | +proj=aea …: 0 | |
| Mean : 7 | NA | Mean :3 | NA | |
| 3rd Qu.:10 | NA | 3rd Qu.:4 | NA | |
| Max. :13 | NA | Max. :4 | NA |
Geometry column is sticky and will move with dataframe as you manipulate it. Even if you select 1 column not including geometry, resulting tibble will include geometry. You can manipulate the data without worrying about keeping track of the geometry column.
pop <- read_csv("Data/shapefiles/alaska_population.csv")
## Parsed with column specification:
## cols(
## year = col_double(),
## city = col_character(),
## lat = col_double(),
## lng = col_double(),
## population = col_double()
## )
kable(head(pop))
| year | city | lat | lng | population |
|---|---|---|---|---|
| 2015 | Adak | 51.88000 | -176.6581 | 122 |
| 2015 | Akhiok | 56.94556 | -154.1703 | 84 |
| 2015 | Akiachak | 60.90944 | -161.4314 | 562 |
| 2015 | Akiak | 60.91222 | -161.2139 | 399 |
| 2015 | Akutan | 54.13556 | -165.7731 | 899 |
| 2015 | Alakanuk | 62.68889 | -164.6153 | 777 |
Convert datafile to sf object
Not sure what coordinate system they used, but you can usually assume WGS84.
pop_4326 <- st_as_sf(pop,
coords = c("lng", "lat"), #Must be long then lat
crs = 4326,
remove = F) # Not the CRS you want, this is the CRS you have
st_crs(pop_4326)
## Coordinate Reference System:
## EPSG: 4326
## proj4string: "+proj=longlat +datum=WGS84 +no_defs"
kable(head(pop_4326))
| year | city | lat | lng | population | geometry |
|---|---|---|---|---|---|
| 2015 | Adak | 51.88000 | -176.6581 | 122 | c(-176.6580556, 51.88) |
| 2015 | Akhiok | 56.94556 | -154.1703 | 84 | c(-154.1702778, 56.9455556) |
| 2015 | Akiachak | 60.90944 | -161.4314 | 562 | c(-161.4313889, 60.9094444) |
| 2015 | Akiak | 60.91222 | -161.2139 | 399 | c(-161.2138889, 60.9122222) |
| 2015 | Akutan | 54.13556 | -165.7731 | 899 | c(-165.7730556, 54.1355556) |
| 2015 | Alakanuk | 62.68889 | -164.6153 | 777 | c(-164.6152778, 62.6888889) |
coordinates were moved to geometry column. You can keep the coordinate columns as-is with the otion remove = F, or grab the coordinates back later with st_coordinates.
Convert population data to correct CRS.
pop_3338 <- pop_4326 %>%
st_transform(crs = 3338)
Join regions to population data
pop_joined <- st_join(pop_3338, ak_regions_3338, join = st_within)
kable(head(pop_joined))
| year | city | lat | lng | population | geometry | region_id | region | mgmt_area |
|---|---|---|---|---|---|---|---|---|
| 2015 | Adak | 51.88000 | -176.6581 | 122 | c(-1537925.19472109, 472627.758126839) | 1 | Aleutian Islands | 3 |
| 2015 | Akhiok | 56.94556 | -154.1703 | 84 | c(-10340.7104999533, 770998.359405536) | 6 | Kodiak | 3 |
| 2015 | Akiachak | 60.90944 | -161.4314 | 562 | c(-400885.468951876, 1236459.71282349) | 8 | Kuskokwim | 4 |
| 2015 | Akiak | 60.91222 | -161.2139 | 399 | c(-389165.656483337, 1235474.78457713) | 8 | Kuskokwim | 4 |
| 2015 | Akutan | 54.13556 | -165.7731 | 899 | c(-766425.670678856, 526057.778097302) | 1 | Aleutian Islands | 3 |
| 2015 | Alakanuk | 62.68889 | -164.6153 | 777 | c(-539724.905338151, 1456222.93092825) | 13 | Yukon | 4 |
Calculate total pop by region
pop_region <- pop_joined%>%
group_by(region)%>%
summarise(total_pop = sum(population))
kable(head(pop_region))
| region | total_pop | geometry |
|---|---|---|
| Aleutian Islands | 8840 | c(-1537925.19472109, -1366143.46472251, -994398.079393849, -973709.724977577, -944172.41501083, -821026.615979201, -766425.670678856, -602834.566822498, -553787.951103421, -529565.351144015, -448226.110521789, -411426.132714694, -280741.592696256, 472627.758126839, 452095.20169139, 436732.543720304, 910614.844423027, 843413.499977354, 506555.64402124, 526057.778097302, 580313.036406802, 611218.300605512, 594155.608587133, 689929.725545686, 612083.912027272, 779314.533965935) |
| Arctic | 8419 | c(-352590.341764316, -227854.370121449, -129336.425512205, -102348.548528926, 116301.930958642, 217275.221280422, 399087.27212233, 2220665.41118778, 2304943.29432865, 2279540.83308864, 2368023.05327234, 2251323.29653676, 2262083.0018213, 2270539.5924916) |
| Bristol Bay | 6947 | c(-363969.660069204, -358018.020127206, -289583.754461908, -262139.602759744, -261628.293155441, -254804.537006741, -213532.239557973, -213388.074862254, -202995.615565114, -197715.14395773, -196903.36361634, -187082.626325644, -184005.711656068, -173932.596815098, -173083.925200933, -163137.070449269, -153780.426195225, -107456.188503196, -51017.9711966115, -50297.4058717386, -47055.5460657353, -42725.5097986676, -17280.3549930402, -5936.65278282792, 1024888.1901217, 1026253.67177276, 1009430.06559634, |
| 1040120.99009605, 9 | 92024.797660 | 499, 1013471.84061753, 845800.312633587, 995955.400435192, 839531.020554467, 917832.390334361, 1044728.55825167, 1055764.08040571, 1086492.8015882, 974057.362538849, 972589.28884844, 1016751.74850314, 968720.918341307, 1038654.84654869, 1085443.57219176, 1081318.58327635, 1109639.71848696, 1050129.99026975, 1134979.02497252, 1088500.77132576) |
| Chignik | 311 | c(-342234.225108129, -322178.808314708, -294487.195998883, -279746.313987941, -271950.130000288, 668505.863132235, 668458.633487607, 704573.439707943, 709452.314309551, 707456.12691357) |
| Cook Inlet | 408254 | c(117813.26699129, 121345.439414341, 122992.425160414, 126241.059787485, 129356.398782612, 129362.065861185, 132562.639894305, 133555.252673053, 135898.689217597, 137077.940144802, 137730.87886673, 143835.851068997, 146192.782101147, 147515.983815201, 148023.119952292, 148696.380797416, 149804.64337332, 149920.362869036, 151513.445844241, 153957.513379527, 156107.169101553, 156509.592797846, 159364.915495099, 161055.907425384, 166137.37360009, 169750.196940579, 176849.640415057, 177178.478704853, |
| 184716.365363079, 1 | 94687.125874 | 653, 200511.941463689, 200932.426828086, 208488.934280451, 214221.03603782, 214391.188713405, 219351.071860349, 220843.264337677, 227988.311457449, 232324.611599445, 235427.544514776, 241182.29300749, 241433.166076421, 247479.811550306, 251737.069230804, 252365.679382743, 252670.026551085, 253432.372713383, 254062.740841565, 254511.292458267, 258276.96081981, 262948.112270659, 266873.032105922, 290718.522360232, 384652.244013451, 1042165.19128273, 1089297.03979731, |
| 1041771.6179726, 11 | 08544.517915 | 71, 1120308.86044262, 1051681.25020962, 1056268.29963608, 1093663.55686322, 1337549.56316271, 1078592.02082441, 1074855.0563817, 1140961.08377402, 1184066.66074515, 1192504.95908391, 1156518.79425289, 1162111.9259837, 1177393.64343675, 1153113.78422148, 1085866.21514648, 1235066.16992225, 1070262.95907023, 1243356.06181366, 1175282.21077194, 1170412.17662047, 1395522.29392843, 1100336.31446279, 1176683.30834551, 1172748.98866288, 1289786.15576041, 1376907.7576959, 1392083.10995902, |
| 1378077.13362572, 1 | 313927.81456 | 957, 1270856.86957824, 1288981.68063988, 1255296.65657145, 1301583.21188262, 1174246.13555575, 1301703.36637738, 1222948.84803521, 1297444.42631788, 1302568.58918523, 1220335.84074539, 1297220.81464198, 1132604.15215082, 1129075.20105229, 1175653.56440314, 1168378.8275541, 1139478.70508714, 1300717.99471823, 1294616.72607455, 1304322.16048775, 1325362.3834562, 1389032.56573601) |
| Copper River | 2294 | c(356847.573264316, 376983.082837351, 392581.831618292, 411998.010254397, 438715.865225621, 444841.017239757, 445528.117855156, 448515.087414687, 451821.252525683, 460229.134202899, 464580.897423723, 475492.942698169, 477472.993792258, 505439.029703051, 510345.009651461, 516353.047188967, 586365.726620148, 1347480.39022661, 1355617.9233319, 1365706.1603655, 1369980.80377912, 1376141.37551982, 1395222.65984282, 1371348.64024713, 1399150.0125542, 1362933.08953683, 1346308.10728011, 1328397.96273153, |
| 1337023.06720419, 1 | 433556.63847 | 495, 1318282.40400227, 1453660.03800478, 1478803.97938127, 1321622.33671479) |
Now geometry column contains points of every community within each region
Here’s a better way to do it
pop_region <- pop_joined%>%
as_tibble()%>% # add this to remove spatial class so that geometry column can be removed.
group_by(region)%>%
summarise(total_pop = sum(population))
kable(head(pop_region))
| region | total_pop |
|---|---|
| Aleutian Islands | 8840 |
| Arctic | 8419 |
| Bristol Bay | 6947 |
| Chignik | 311 |
| Cook Inlet | 408254 |
| Copper River | 2294 |
Add total_pops to the region spatial dataframe
pop_region_3338 <- left_join(ak_regions_3338, pop_region, by = "region")
plot(pop_region_3338)
Calculate population by management area
pop_mgmt <- pop_region_3338%>%
group_by(mgmt_area)%>%
summarise(total_pop=sum(total_pop))
plot(pop_mgmt["total_pop"])
So you can group_by and summarise a spatial dataframe and it retains the geometry. To keep internal boundaries between these polygons, set do_union = F in summarise function.
Load rivers shapefile
rivers_3338<-read_sf("Data/shapefiles/ak_rivers_simp.shp")
st_crs(rivers_3338)
## Coordinate Reference System:
## No EPSG code
## proj4string: "+proj=aea +lat_1=55 +lat_2=65 +lat_0=50 +lon_0=-154 +x_0=0 +y_0=0 +ellps=GRS80 +units=m +no_defs"
No EPSG code, but CRS defined in the proj4string in “+proj=aea.” CRS is 3338
ggplot() +
geom_sf(data=pop_region_3338, aes(fill = total_pop))+
geom_sf(data=pop_3338, size=0.5)+
geom_sf(data=rivers_3338, aes(size=StrOrder), color="black")+
scale_size(range=c(0.01, 0.2), guide=F)+
scale_fill_continuous(low="khaki", high="firebrick", name="Total population", labels=comma)+
theme_bw()
First need to change projection to make population data compatible with tile data for base maps.
pop_3857 <- pop_3338 %>%
st_transform(crs=3857)
Define a function to fix the bounding box to be in EPSG:3857
# See https://github.com/dkahle/ggmap/issues/160#issuecomment-397055208
ggmap_bbox_to_3857 <- function(map) {
if (!inherits(map, "ggmap")) stop("map must be a ggmap object")
# Extract the bounding box (in lat/lon) from the ggmap to a numeric vector,
# and set the names to what sf::st_bbox expects:
map_bbox <- setNames(unlist(attr(map, "bb")),
c("ymin", "xmin", "ymax", "xmax"))
# Coonvert the bbox to an sf polygon, transform it to 3857,
# and convert back to a bbox (convoluted, but it works)
bbox_3857 <- st_bbox(st_transform(st_as_sfc(st_bbox(map_bbox, crs = 4326)), 3857))
# Overwrite the bbox of the ggmap object with the transformed coordinates
attr(map, "bb")$ll.lat <- bbox_3857["ymin"]
attr(map, "bb")$ll.lon <- bbox_3857["xmin"]
attr(map, "bb")$ur.lat <- bbox_3857["ymax"]
attr(map, "bb")$ur.lon <- bbox_3857["xmax"]
map
}
Create a bounding box
bbox <- c(-170, 52, -130, 64)
ak_map <- get_stamenmap(bbox, zoom=4)
ak_map_3857 <- ggmap_bbox_to_3857(ak_map)
Make plot with basemap
ggmap(ak_map_3857) +
geom_sf(data = pop_3857, aes(color=population), inherit.aes = F)+
scale_color_continuous(low = "khaki", high = "firebrick", labels = comma)
## Coordinate system already present. Adding new coordinate system, which will replace the existing one.
Must give leaflet a CRS
epsg3338 <- leafletCRS(
crsClass = "L.Proj.CRS",
code = "EPSG:3338",
proj4def = "+proj=aea +lat_1=55 +lat_2=65 +lat_0=50 +lon_0=-154 +x_0=0 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs",
resolutions = 2^(16:7)
)
And must give leaflet your data in EPSG 4326
Transform data to 4326
pop_region_4326 <- pop_region_3338%>%
st_transform(crs=4326)
Plot
leaflet(options=leafletOptions(crs = epsg3338))%>%
addPolygons(data=pop_region_4326,
fillColor = "gray",
weight=1,
)
Add population to plot
pal <- colorNumeric(palette = "Reds", domain = pop_region_4326$total_pop)
m <- leaflet(options = leafletOptions(crs = epsg3338)) %>%
addPolygons(data = pop_region_4326,
fillColor = ~pal(total_pop),
weight = 1,
color = "black",
fillOpacity = 1,
label = ~region) %>%
addLegend(position = "bottomleft",
pal = pal,
values = range(pop_region_4326$total_pop),
title = "Total Population")
m
Make plot more complex and add individual communities with their populations.
pal <- colorNumeric(palette = "Reds", domain = pop_region_4326$total_pop)
m <- leaflet(options = leafletOptions(crs = epsg3338)) %>%
addPolygons(data = pop_region_4326,
fillColor = ~pal(total_pop),
weight = 1,
color = "black",
fillOpacity = 1,
label = ~region) %>%
addCircleMarkers(data = pop_4326,
lat = ~lat,
lng = ~lng,
radius = ~log(population/500), # arbitrary scaling
fillColor = "gray",
fillOpacity = 1,
weight = 0.25,
color = "black",
label = ~paste0(pop_4326$city, ", population ", comma(pop_4326$population))) %>%
addLegend(position = "bottomleft",
pal = pal,
values = range(pop_region_4326$total_pop),
title = "Total Population")
m
With built in projections, leaflet can no longer properly grab tiles from tile servers.